    Biological Applications of Knowledge Graph Embedding Models

    Complex biological systems are traditionally modelled as graphs of interconnected biological entities. These graphs, i.e. biological knowledge graphs, are then processed using graph exploratory approaches to perform different types of analytical and predictive tasks. Despite the high predictive accuracy of these approaches, they have limited scalability due to their dependency on time-consuming path exploratory procedures. In recent years, owing to the rapid advances of computational technologies, new approaches for modelling graphs and mining them with high accuracy and scalability have emerged. These approaches, i.e. knowledge graph embedding (KGE) models, operate by learning low-rank vector representations of graph nodes and edges that preserve the graph's inherent structure. These approaches were used to analyse knowledge graphs from different domains where they showed superior performance and accuracy compared to previous graph exploratory approaches. In this work, we study this class of models in the context of biological knowledge graphs and their different applications. We then show how KGE models can be a natural fit for representing complex biological knowledge modelled as graphs. We also discuss their predictive and analytical capabilities in different biology applications. In this regard, we present two example case studies that demonstrate the capabilities of KGE models: prediction of drug-target interactions and polypharmacy side effects. Finally, we analyse different practical considerations for KGEs, and we discuss possible opportunities and challenges related to adopting them for modelling biological systems. The work presented in this paper was supported by the CLARIFY project that has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 875160, and by Insight research centre supported by the Science Foundation Ireland (SFI) grant (12/RC/2289_2). Peer-reviewed, 2021-02-1
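To make the idea of scoring graph edges from vector representations concrete, the following is a minimal sketch using the translation-based TransE scoring function, one well-known KGE model. The abstract does not say which models the paper uses, so TransE is an assumption made for illustration. The entity names, the relation name, and the 3-dimensional embeddings are hypothetical and set by hand; a real model would learn them from the knowledge graph.

```python
import math

# Hypothetical 3-dimensional embeddings, set by hand for illustration only.
# A trained KGE model would learn these vectors from the graph's triples.
entity_emb = {
    "drug_A": [0.1, 0.2, 0.0],
    "protein_X": [0.6, 0.1, 0.3],
    "protein_Y": [-0.5, 0.9, -0.4],
}
relation_emb = {"targets": [0.5, -0.1, 0.3]}


def transe_score(head, relation, tail):
    """TransE score: negative Euclidean distance between (head + relation) and tail.

    Scores closer to zero mean the triple is considered more plausible.
    """
    h = entity_emb[head]
    r = relation_emb[relation]
    t = entity_emb[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda tail: transe_score(head, relation, tail),
        reverse=True,
    )


if __name__ == "__main__":
    for tail in rank_tails("drug_A", "targets", ["protein_X", "protein_Y"]):
        print(tail, transe_score("drug_A", "targets", tail))
```

Ranking candidate tails for a fixed head and relation is the usual way such a model is queried for link prediction, for example when asking which proteins a drug might target. No path traversal over the graph is involved at query time, only vector arithmetic.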

    Salicylic Acid and Risk of Colorectal Cancer: A Two-Sample Mendelian Randomization Study.

    Salicylic acid (SA) has observationally been shown to decrease colorectal cancer (CRC) risk. Aspirin (acetylsalicylic acid, which rapidly deacetylates to SA) is an effective primary and secondary chemopreventive agent. Through a Mendelian randomization (MR) approach, we aimed to address whether levels of SA affected CRC risk, stratifying by aspirin use. A two-sample MR analysis was performed using GWAS summary statistics of SA (INTERVAL and EPIC-Norfolk, N = 14,149) and CRC (CCFR, CORECT, GECCO and UK Biobank, 55,168 cases and 65,160 controls). The DACHS study (4410 cases and 3441 controls) was used for replication and stratification by aspirin use. SNPs proxying SA were selected via three methods: (1) functional SNPs that influence the activity of aspirin-metabolising enzymes; (2) pathway SNPs present in enzymes' coding regions; and (3) genome-wide significant SNPs. We found no association between functional SNPs and SA levels. The pathway and genome-wide SNPs showed no association between SA and CRC risk (OR: 1.03, 95% CI: 0.84-1.27 and OR: 1.08, 95% CI: 0.86-1.34, respectively). Results remained unchanged upon aspirin use stratification. We found little evidence to suggest that an SD increase in genetically predicted SA protects against CRC risk in the general population and upon stratification by aspirin use.
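As a rough illustration of how a two-sample MR estimate and its odds ratio can be derived from GWAS summary statistics, the sketch below implements the fixed-effect inverse-variance weighted (IVW) estimator, a commonly used choice. The abstract does not state which estimator the study applied, so IVW is an assumption here. The function names are hypothetical and the numbers in the usage example are toy values, not data from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect (log odds per unit exposure).

    beta_exposure: per-SNP effects on the exposure (e.g. SD units)
    beta_outcome:  per-SNP effects on the outcome (log odds ratios)
    se_outcome:    standard errors of the per-SNP outcome effects
    """
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    beta = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return beta, se


def to_odds_ratio(beta, se, z=1.96):
    """Convert a log odds estimate to an odds ratio with a 95% confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Toy summary statistics for three SNPs (illustrative only).
    bx = [0.10, 0.20, 0.15]
    by = [0.004, 0.005, 0.006]
    se = [0.02, 0.03, 0.025]
    beta, beta_se = ivw_estimate(bx, by, se)
    odds_ratio, lower, upper = to_odds_ratio(beta, beta_se)
    print(f"OR {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A confidence interval that spans 1, as in the pathway and genome-wide results reported above, is read as little evidence of an effect in either direction.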

    A combined proteomics and Mendelian randomization approach to investigate the effects of aspirin-targeted proteins on colorectal cancer

    Background: Evidence for aspirin’s chemopreventative properties on colorectal cancer (CRC) is substantial, but its mechanism of action is not well-understood. We combined a proteomic approach with Mendelian randomization (MR) to identify possible new aspirin targets that decrease CRC risk. Methods: Human colorectal adenoma cells (RG/C2) were treated with aspirin (24 hours) and a stable isotope labeling with amino acids in cell culture (SILAC) based proteomics approach identified altered protein expression. Protein quantitative trait loci (pQTLs) from INTERVAL (N=3,301) and expression QTLs (eQTLs) from the eQTLGen Consortium (N=31,684) were used as genetic proxies for protein and mRNA expression levels. Two-sample MR of mRNA/protein expression on CRC risk was performed using eQTL/pQTL data combined with CRC genetic summary data from the Colon Cancer Family Registry (CCFR), Colorectal Transdisciplinary (CORECT), Genetics and Epidemiology of Colorectal Cancer (GECCO) consortia and UK Biobank (55,168 cases and 65,160 controls). Results: Altered expression was detected for 125/5886 proteins. Of these, aspirin decreased MCM6, RRM2, and ARFIP2 expression, and MR analysis showed that a standard deviation increase in mRNA/protein expression was associated with increased CRC risk (OR: 1.08, 95% CI, 1.03–1.13; OR: 3.33, 95% CI, 2.46–4.50; and OR: 1.15, 95% CI, 1.02–1.29, respectively). Conclusions: MCM6 and RRM2 are involved in DNA repair whereby reduced expression may lead to increased DNA aberrations and ultimately cancer cell death, whereas ARFIP2 is involved in actin cytoskeletal regulation, indicating a possible role in aspirin’s reduction of metastasis. Impact: Our approach has shown how laboratory experiments and population-based approaches can combine to identify aspirin-targeted proteins possibly affecting CRC risk

    Association Between Telomere Length And Risk Of Cancer And Non-neoplastic Diseases: A Mendelian Randomization Study

    IMPORTANCE The causal direction and magnitude of the association between telomere length and incidence of cancer and non-neoplastic diseases is uncertain owing to the susceptibility of observational studies to confounding and reverse causation. OBJECTIVE To conduct a Mendelian randomization study, using germline genetic variants as instrumental variables, to appraise the causal relevance of telomere length for risk of cancer and non-neoplastic diseases. DATA SOURCES Genomewide association studies (GWAS) published up to January 15, 2015. STUDY SELECTION GWAS of noncommunicable diseases that assayed germline genetic variation and did not select cohort or control participants on the basis of preexisting diseases. Of 163 GWAS of noncommunicable diseases identified, summary data from 103 were available. DATA EXTRACTION AND SYNTHESIS Summary association statistics for single nucleotide polymorphisms (SNPs) that are strongly associated with telomere length in the general population. MAIN OUTCOMES AND MEASURES Odds ratios (ORs) and 95% confidence intervals (CIs) for disease per standard deviation (SD) higher telomere length due to germline genetic variation. RESULTS Summary data were available for 35 cancers and 48 non-neoplastic diseases, corresponding to 420 081 cases (median cases, 2526 per disease) and 1 093 105 controls (median, 6789 per disease). Increased telomere length due to germline genetic variation was generally associated with increased risk for site-specific cancers. The strongest associations (ORs [95% CIs] per 1-SD change in genetically increased telomere length) were observed for glioma, 5.27 (3.15-8.81); serous low-malignant-potential ovarian cancer, 4.35 (2.39-7.94); lung adenocarcinoma, 3.19 (2.40-4.22); neuroblastoma, 2.98 (1.92-4.62); bladder cancer, 2.19 (1.32-3.66); melanoma, 1.87 (1.55-2.26); testicular cancer, 1.76 (1.02-3.04); kidney cancer, 1.55 (1.08-2.23); and endometrial cancer, 1.31 (1.07-1.61). Associations were stronger for rarer cancers and at tissue sites with lower rates of stem cell division. There was generally little evidence of association between genetically increased telomere length and risk of psychiatric, autoimmune, inflammatory, diabetic, and other non-neoplastic diseases, except for coronary heart disease (OR, 0.78 [95% CI, 0.67-0.90]), abdominal aortic aneurysm (OR, 0.63 [95% CI, 0.49-0.81]), celiac disease (OR, 0.42 [95% CI, 0.28-0.61]) and interstitial lung disease (OR, 0.09 [95% CI, 0.05-0.15]). CONCLUSIONS AND RELEVANCE It is likely that longer telomeres increase risk for several cancers but reduce risk for some non-neoplastic diseases, including cardiovascular diseases.